gh-149101: Implement PEP 788#149116
Conversation
Documentation build overview
70 files changed ·
|
encukou
left a comment
There was a problem hiding this comment.
Thanks for adding these!
I'll send notes for Doc/ now; code review coming up.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
|
🤖 New build scheduled with the buildbot fleet by @ZeroIntensity for commit bc78c10 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F149116%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
|
for buildbots: The RHEL8 failures aren't relevant. Refleaks are worrying though. |
Never mind; main currently leaks (#149179). |
Claude pointed this out as an issue. I think it makes some sense.
We cannot safely release a guard until PyThreadState_Delete() has been called, otherwise _PyThreadState_DeleteList can see an invalid thread state in the list. For example: 1. Thread A starts waiting on guards. 2. Thread B calls PyThreadState_Release(), which clears its thread state (making it invalid), and then releases the guard. 3. Thread B's thread state is still on the tstate list! 4. Thread A begins finalization now that there are no guards left. 5. Thread A calls _PyThreadState_DeleteList, which iterates over the thread state list, and then finds thread B's thread state. 6. Thread A tries to call PyThreadState_Clear() on thread B's thread state, but it has already been cleared! 7. Kaboom.
| @@ -2330,17 +2370,28 @@ make_pre_finalization_calls(PyThreadState *tstate, int subinterpreters) | |||
| int has_subinterpreters = subinterpreters | |||
There was a problem hiding this comment.
I think all these checks here should be done atomically without acquiring/releasing locks in between. So basically:
PyMutex_Lock(&interp->ceval.pending.mutex);
_PyEval_StopTheWorldAll(interp->runtime);
HEAD_LOCK(&_PyRuntime);
// ... do the checks and compare-exchange on finalization_guards
HEAD_UNLOCK(_PyRuntime);
_PyEval_StartTheWorldAll(interp->runtime);
PyMutex_Unlock(&interp->ceval.pending.mutex);
In other words, lift the HEAD_LOCK() out from runtime_has_subinterpreters and interp_has_threads
|
|
||
| // CPython detail: this struct is not defined anywhere; tokens are special | ||
| // sentinels or PyThreadState's. | ||
| typedef struct PyThreadStateToken PyThreadStateToken; |
There was a problem hiding this comment.
I stuck the pattern in ChatGPT and it agreed with @encukou that it's not UB, at least if struct PyThreadStateToken is kept opaque.
The typedef struct PyThreadStateToken PyThreadStateToken definition is fine with me, and will probably provide stricter type checking.
| _PyThreadState_Detach(tstate); | ||
| _PyEval_StartTheWorldAll(interp->runtime); | ||
| PyMutex_Unlock(&interp->ceval.pending.mutex); | ||
| _PyThreadState_Attach(tstate); |
There was a problem hiding this comment.
Is the current ordering here deliberate? I'm not sure if there's any issues, but the split of detach/attach across _PyEval_StartTheWorldAll and PyMutex_Unlock confuses me.
I would write it as:
_PyEval_StartTheWorldAll(interp->runtime);
PyMutex_Unlock(&interp->ceval.pending.mutex);
// Temporarily let other threads execute
_PyThreadState_Detach(tstate);
_PyThreadState_Attach(tstate);
But it's also not clear to me whey we need _PyThreadState_Detach/Attach here. If other threads are running, won't we wait on them in either wait_for_thread_shutdown or when parking on the finalization_guards?
There was a problem hiding this comment.
Yeah, I think that's an artifact of an older variation.
| ++attached_tstate->ensure.counter; | ||
| return NO_TSTATE_SENTINEL; |
There was a problem hiding this comment.
There is a bug here (found by Claude) where PyThreadState_Ensure/PyThreadState_Release with an already attached detaches on release when it should keep the active thread state.
In other words, we aren't properly distinguishing the two cases "already has a valid active thread state" and "doesn't have any attached thread state".
https://gist.github.com/colesbury/b6538312a898dd97fdd770b22fb3c338
| Failure indicates that the process is out of memory or that the main | ||
| interpreter has finalized (or never existed). | ||
|
|
||
| Using this function in extension libraries is strongly discouraged, because |
There was a problem hiding this comment.
We should use an affirmative tone. For example:
Use this function when an interpreter pointer or view cannot be supplied by the caller, for example, when a native threading library does not provide a void *arg parameter that could carry a :c:type:
PyInterpreterGuardor :c:type:PyInterpreterView. In code that supports subinterpreters, prefer :c:func:PyInterpreterView_FromCurrentso the guard tracks the calling interpreter rather than the main one.
| // Yield the processor to other threads (e.g., sched_yield). | ||
| extern void _Py_yield(void); | ||
| // Exported for _testembed. | ||
| PyAPI_FUNC(void) _Py_yield(void); |
There was a problem hiding this comment.
I'm not sure I understand the purpose of this in the tests. If the intent is to try to capture timing bugs, use something like pysleep. You can copy-paste those few lines of code if necessary.
I'd prefer we don't export this symbol since that complicates potential future bug fixes if we need to change or remove it.
There was a problem hiding this comment.
I'll just remove it. My hope was that this would make some of the apparent thread safety issues reproduce more reliably, but it doesn't seem that it did.
| if (detached_gilstate != NULL && detached_gilstate->interp == interp) { | ||
| /* There's a detached thread state that works. */ | ||
| assert(attached_tstate == NULL); | ||
| ++detached_gilstate->ensure.counter; |
There was a problem hiding this comment.
I think this also has a bug. Let's say you have:
t1 = PyThreadState_Ensure(main_interp_guard); // counter 0 -> 1
Py_BEGIN_ALLOW_THREADS
t2 = PyThreadState_Ensure(main_interp_guard); // counter 1 -> 2
PyThreadState_Release(t2); // BUG! 2->1
Py_END_ALLOW_THREADS
PyThreadState_Release(t1);
The inner PyThreadState_Release should restore the thread state to "not attached", but it leaves it as attached due to the counter.
There was a problem hiding this comment.
Hm, this looks tricky to solve. I guess we could turn PyThreadStateToken into an actual structure that holds the necessary state, but I'm not super excited about adding another heap allocation to PyThreadState_Ensure. I'll have to think about this one a little bit.
There was a problem hiding this comment.
Ok, my solution was to add a new sentinel that says to detach the thread state upon calling PyThreadState_Release, regardless of the counter.
Co-authored-by: Sam Gross <colesbury@gmail.com>
Co-authored-by: Sam Gross <colesbury@gmail.com>
Co-authored-by: Sam Gross <colesbury@gmail.com>
…_with_destructor.
…tor. This also allows us to move it to _testcapi.
|
🤖 New build scheduled with the buildbot fleet by @ZeroIntensity for commit a34dbc6 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F149116%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
Hugo has graciously given me permission to backport this if we don't make the May 5th deadline, but let's try to get this done in time!
I will write a full tutorial and migration guide once this is merged; I want to first make sure that this lands before the beta freeze.